TrustRadius
Azure AI Speech
Formerly Azure Cognitive Speech Services

Overview

What is Azure AI Speech?

The Azure AI Speech service provides a range of speech recognition and generation capabilities, including speech transcription, text-to-speech, speech translation, and speaker recognition.

Pricing

Entry-level set up fee?

  • No setup fee
For the latest information on pricing, visit https://azure.microsoft.com/en…

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services

Starting price (does not include set up fee)

  • $1 per month

Product Details

What is Azure AI Speech?

The Speech service unifies speech-to-text, text-to-speech, and speech translation under a single Azure subscription. Its speech capabilities can be added to applications, tools, and devices through the Speech CLI, Speech SDK, Speech Devices SDK, Speech Studio, or REST APIs.
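As a rough illustration of the REST API route mentioned above, the following sketch builds (but does not send) a speech-to-text request for a short WAV clip using only the Python standard library. The endpoint shape, header names, region, and key shown here are assumptions for illustration and should be checked against the current Azure documentation; `build_stt_request` is a hypothetical helper, not part of any SDK.

```python
import urllib.parse
import urllib.request


def build_stt_request(region: str, key: str, wav_bytes: bytes,
                      language: str = "en-US") -> urllib.request.Request:
    """Build (without sending) a short-audio speech-to-text REST request.

    Assumption: the regional endpoint and headers below follow the
    short-audio REST pattern; verify them against the official docs.
    """
    query = urllib.parse.urlencode({"language": language})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1"
        f"?{query}"
    )
    headers = {
        # Subscription key of the Speech resource (placeholder value in the demo).
        "Ocp-Apim-Subscription-Key": key,
        # Assumed audio format: 16 kHz PCM WAV.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, data=wav_bytes, headers=headers, method="POST")


if __name__ == "__main__":
    # Nothing is sent over the network; this only shows the request that would be made.
    req = build_stt_request("westus", "YOUR_KEY", b"")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without credentials.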

Services include:

Speech to Text - Transcribe audio in more than 92 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, and capture key discussions in meetings.

Text to Speech - Create apps and services that speak conversationally, choosing from more than 215 voices, and 60 languages and variants. Create natural-sounding audio content, improve accessibility with read-aloud functionality, and create custom voice assistants.

Speech Translation - Translate audio from more than 30 languages and customize translations for an organization's specific terms, using a preferred programming language.

Speaker Recognition - Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to an app.

Custom Commands - Users can build a touchless, voice-first experience to improve safety and support back-to-work scenarios.

Custom Keywords - Create a custom keyword for IoT devices and voice-enabled assistants to set your brand apart, making it more personal, personable, and secure.
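To make the Text to Speech item above more concrete: voice selection is commonly expressed as an SSML document that names the voice to use. The sketch below builds such a document with the Python standard library. The voice name `en-US-JennyNeural` and the helper `build_ssml` are assumptions chosen for illustration; the voices actually available depend on the service's current catalog.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str = "en-US-JennyNeural",
               lang: str = "en-US") -> str:
    """Return an SSML string asking a named voice to speak `text`.

    Assumption: the voice name is illustrative; substitute one that the
    service currently offers for the target language.
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    voice_el = ET.SubElement(speak, "voice", {"name": voice})
    # ElementTree escapes characters such as & and < when serializing.
    voice_el.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Your appointment is confirmed."))
```

The resulting string could then be passed to a text-to-speech call in the SDK or REST API; that call is omitted here because it requires a subscription key.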

Azure AI Speech Technical Details

Deployment Types: Software as a Service (SaaS), Cloud, or Web-Based
Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions

The Azure AI Speech service provides a range of speech recognition and generation capabilities, including speech transcription, text-to-speech, speech translation, and speaker recognition.

Azure AI Speech starts at $1 per month.

The most common users of Azure AI Speech are from Enterprises (1,001+ employees).

Reviews and Ratings

(16)

Attribute Ratings

Reviews

(1-3 of 3)
Score 8 out of 10
Vetted Review
Verified User
Incentivized
We use Azure Cognitive Speech Services to add speech-to-text, text-to-speech, and other AI-driven, NLP-related speech services to our customised applications, especially those involving chatbots for different business functions. The idea was to use speech services in mobile apps to make them hands-free and more accessible. The range of languages helped, especially in an Indian context, as only one competitor product could support as many Indian languages in addition to a few European and Middle Eastern ones.
  • APIs offered are very robust.
  • The number of supported languages is far greater than that of most competitors.
  • Integration with our custom apps was easy.
  • Speech models that we created using neural voices were quite impressive.
  • Translation services worked really well.
  • Built-in machine learning opens it up to many more business use cases in the future.
  • At times different accents can be an issue, but with more data this can improve over time, especially with reinforcement learning.
  • Price is on the higher side so ROI is slow to realise.
  • For community development, perhaps some of its source code could be open-sourced for further engagement and development as the overall community is small.
Excellent for voice-enabled apps
Built-in security, so speech data does not go outside
Flexible deployment on the cloud
Speech translation in real-time scenarios
Using customised keywords to activate IoT devices
  • Text to speech.
  • Speech to text.
  • Translation APIs.
  • Customizable keywords.
  • Integration with 3rd party apps.
  • Ease of deployment on the cloud.
  • Although it takes time, our apps powered by speech services gave us good ROI.
  • Made our products stand out in finance, HR, and operations functions.
  • Gave much-needed AI-powered machine learning integration through the NLP offered by Azure.
  • Our chat assistants became more user-friendly, and the UX improved as a result.
Price is the number one factor that makes Azure stand out among its competitors. The number of languages supported, especially from an Indian context, is also remarkable compared with its competitors, and the vocabulary and accent support within those languages matters too. Its cloud-first deployment strategy also makes app deployment very easy.
Google Drive, IBM Cognos Analytics with Watson, Automation Anywhere, Microsoft 365 (formerly Office 365), Sophos Intercept X, Jira Software, VMware Blockchain, Broadcom Test Data Manager (formerly CA Test Data Manager)
No
  • Price
  • Product Features
  • Product Usability
  • Product Reputation
Price. Because this is a relatively new technology, speech services with additional features are expensive. Azure Cognitive Speech Services was at least within our budget. Already being Office users also helped, relationship-wise.
We may evaluate more on our products integrating with product demos to save time instead of 3rd party products.
Excellent knowledgeable sales team.
After-sales support was fantastic.
Price
Support package.
APIs
Future version upgrades.
Be honest, take them through your vision and goals, be realistic and practical. Be very upfront about budgets and RoIs envisaged.
Score 7 out of 10
Vetted Review
Verified User
Incentivized
It is one of the most advanced software products available. Its advanced features let it recognize even distorted audio efficiently. We can effectively convert speech to text and text to voice, which helps us communicate, take notes, and accurately discover requirements.
  • The free version provides up to five hours of audio and allows you to create one custom voice model per month.
  • Microsoft's language processing system justifies the cost of the software - it recognizes even faint and distorted sound in many cases.
  • It works with many languages and dialects, which helps in understanding a wide range of speakers.
  • The software is not user-friendly: it has a complicated interface and requires a lot of training to set up.
  • The pricing is also high, so it is not affordable for an individual user who is not on a company plan.
Azure Cognitive Speech Services can work with many languages and dialects, making it essential for people working with multilingual clients. It also helps capture what is said during meetings. The setup is complicated, so it is not easy for a novice user. The pricing is also high.
  • Text-to-Speech module
  • Speech-to-text module
  • It helps us catch requirements and make notes so that we don't forget when drawing out proposals.
  • Also helps us with targeted pitches and helps save time.
Score 10 out of 10
Vetted Review
Verified User
Incentivized
We mainly used Azure Cognitive Speech Services for text-to-speech and speech-to-text use cases, to keep a record of what we say to our clients. As technical support engineers, we give our clients many pointers and reminders, and we have to be sure we know what we previously told them, so this is very important for us.
  • Easily leverage the available APIs even with the free version.
  • Accurate speech to text.
  • Local languages are supported.
  • Pricing is costly; it seems Microsoft is pushing you toward the premium version, since the free version covers only five hours of translation.
Since this is made by Microsoft, integration with your O365 applications shouldn't be a huge obstacle. You just have to check whether your apps are available for integration and use the available APIs, and then you are good to go.
  • Speech to text analysis.
  • Can read and analyze local languages.
  • Technical support engineers were able to review what they had said and keep logs for each customer.
  • Acoustic Analytics (formerly IBM Watson Customer Experience Analytics)
Just used this once but for me Azure is much easier since the company uses Office 365.